Outline of Presentation 


• Investigative Approach 

• An Optocoupler’s Tale 

• On the Matter of Small Probabilities 

• What’s with the Noise Spikes? 

• The Meaning of an Upset in a Fiber Optic Link 

• Considerations 



Latent damage sites: device did not fail during ground irradiation, 
but at some time afterward during operation. 

Could this have been observed in-flight? 
Anomaly Resolution - 

Root Cause Investigation for Radiation Engineers 


• Determine orbital location and time of event 

- Look for the obvious such as solar events or South Atlantic 
Anomaly (SAA) 

• Review electronic parts list for potential sensitive devices 

• Review identified device in specific circuit application 

- Factors such as duty cycle, operating speed, voltage levels, 
and so forth 

• Obtain existing SEE, dose, and damage data or gather new 
data 

- Compare applications between in-circuit and ground data 

- Perform ground testing if needed 

• Determine risk probabilities 

- SEE rates, etc. 

- Failure potential 

• Recommend mitigative action(s) if possible 
An Optocoupler’s Tale - Background 



Optocouplers 

- Used extensively for the isolation of signals between 
systems or boxes 

- Translate electrical signals to optical, then back to 
electrical 

What radiation-induced failure modes may exist? 

- Long-term degradation such as current transfer ratio 
(CTR) - output/input 

- Single particle events 

* Photodiodes, for example, have a history of being used as 
energetic particle detectors! 


[Figure: Typical Block Diagram of an Optocoupler] 
An Optocoupler’s Tale - 
NASA’s Most Famous Science Spacecraft 



Hubble Space Telescope (HST) 

- Flying for over 18 years 

- Tremendous scientific discoveries (as well as gorgeous images!) 

HST has had several servicing missions (SM) 

- New instruments 

- System upgrades and maintenance 

On SM2, launched Feb 14th, 1997, two new instruments were 
installed 

- Multiple anomalies were observed during the on-orbit engineering 
calibration for these instruments 

- HST’s main radiation concern is SAA 
An Optocoupler’s Tale - 
Resolving the Anomaly 



What steps were needed to determine ROOT CAUSE and action? 

- Review of environment during anomalies 

• All events occurred in the SAA 

- Review of parts list 

• Optocoupler highlighted as most likely candidate 

- Review of circuit application 

• SETs simulated showing possible cause 

- SET could trigger a high-voltage portion of the instrument and cause failure 

- Review or gather radiation test data 

• No data existed; accelerator test performed 
The Optocoupler - Final Analysis 

What steps were needed to determine ROOT CAUSE and action? - 
continued 

- Determine risk probability (i.e., upset rates) 

• Optocouplers are not just electrical 

• Considerations for tools beyond CREME96 began with this and 
related work 

- Determine actions to mitigate or reduce risk 

• In-flight hardware is not easily modified ;o( 

- FPGAs improve this ability (but not here) 

• Operational change installed via software update 

- No instrument operation during SAA 

- Critical science was NOT impacted, but some science data loss 
incurred 
On the Matter of Small Probabilities - 

Background 

• Solid State Recorders (SSRs) 

- A means for storing science data on-board a spacecraft 

- Use high-density memory ICs for density/power advantages 

• SRAM (early 1990’s) 

• DRAM (mid-1990’s and later) 

• Flash (being considered) 

• DRAMs: What radiation-induced failure modes may exist? 

- TID 

• Traditional leakage increases, cell failures, etc. 

- SEE 

• Destructive: SEL, stuck bits 

• Upset: bit/multiple bits, block errors, mode errors, SEFI 


1 Gb SDRAM circa 2006 
Feature size is 90nm 
On the Matter of Small Probabilities - 

NASA’s Most Famous Science Spacecraft (yet again!) 

On SM2, Feb 14th, 1997, a new SSR was 
installed to increase data storage capacity 

- HST passes through the SAA several times daily 

• Bit upsets tracked fairly well with predicted rate based on 
ground data (3 samples, one proton energy) 

• HOWEVER, two more complex anomalies were observed 

- Each had ~100 bits in error (block) 

- Block was not corrected by a re-write 

- Project in panic! 


HST SSR utilizes 
Irvine Sensors DRAM Modules 
Comprised of 16 Mb IBM Luna DRAMs 
On the Matter of Small Probabilities - 

Resolving the Anomaly 

What steps were needed to determine ROOT CAUSE and 
action? 

- Review of environment during anomaly 

• SAA 

- Review of parts list 

• Memory controller was rad-hard 

• DRAM was not 

- Review of circuit application 

• Circuit application was the same as in ground testing (refresh rate, 
etc) 

- Review or gather radiation test data 

• Proton data: no observed block errors (sample size = 3 w/ 1x 
environment fluences) 

• HOWEVER, heavy ion data exhibited this type of event at low LETs 

- Proton events would be expected 

- New test data required for statistics on 1440 device usage 

• With 1440 devices being used for this SSR application 

- Expected event cross-section of ~ a few E-13 cm² based on 2 events in 9 
months versus (predicted) in-flight proton fluence 
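
The "small probabilities" point can be illustrated with a little arithmetic. The sketch below is a minimal illustration, not the analysis the project performed. It uses the 2 in-flight events across 1440 devices quoted above, assumes the events follow Poisson statistics, and assumes the original proton test exposed 3 devices to roughly one 9-month-equivalent flight fluence (a reading of the "1x environment fluences" note). The function names are hypothetical.

```python
import math


def events_per_device_exposure(n_events: int, n_devices: int) -> float:
    """Mean events per device per unit exposure (here: 9 months of flight fluence)."""
    return n_events / n_devices


def prob_zero_events(mean_events: float) -> float:
    """Poisson probability of observing no events when `mean_events` are expected."""
    return math.exp(-mean_events)


# From the text: 2 block-error events across 1440 DRAMs in 9 months.
per_device = events_per_device_exposure(2, 1440)

# Assumption: the ground test used 3 devices at about 1x the same fluence.
expected_in_ground_test = per_device * 3
p_miss = prob_zero_events(expected_in_ground_test)

print(f"expected events in 3-device test: {expected_in_ground_test:.4f}")
print(f"chance the test sees nothing:     {p_miss:.3f}")
```

Under these assumptions a 3-device test at mission fluence would miss the block-error mode more than 99% of the time, which is why a larger sample and higher fluence were needed for statistics.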
On the Matter of Small Probabilities - 

Final Analysis 


Review or gather radiation test data (cont’d) 

- New test undertaken with protons with 100 die and to higher 
proton fluence levels 

• 9 events observed with proton fluences ~100x over expected HST 
levels 

- 2 different event signatures noted 
» block (column/row) errors 

» weak columns (suspect data - sometimes good, sometimes bad) 

Determine risk probability (i.e., upset rates) 

- Predicted error rate of 2.2/yr is the same order of magnitude as 
observed 
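
As a rough consistency check, the predicted rate can be compared with the in-flight count quoted earlier (2 events in 9 months). The sketch below assumes Poisson statistics and is only illustrative; the helper names are hypothetical and it does not reproduce how the 2.2/yr prediction itself was derived.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson process with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def prob_at_least(k: int, mean: float) -> float:
    """Probability of observing k or more events."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


predicted_rate_per_year = 2.2   # from the text
observed_events = 2             # from the text
observation_years = 9 / 12      # 9 months of flight

observed_rate = observed_events / observation_years
expected_events = predicted_rate_per_year * observation_years
p_two_or_more = prob_at_least(observed_events, expected_events)

print(f"observed rate:            {observed_rate:.2f} per year")
print(f"expected events in 9 mo:  {expected_events:.2f}")
print(f"P(>= 2 events in 9 mo):   {p_two_or_more:.2f}")
```

With these inputs the observed rate is about 2.7 per year against a predicted 2.2 per year, and seeing two or more events in 9 months is close to an even-odds outcome, consistent with "same order of magnitude".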


Determine actions to mitigate or reduce risk 
- Reset of mode register or power cycle clears the anomaly 


• Circuitry not included to provide reset 

• Power cycle determined to be feasible when needed 



- Data is Reed-Solomon (RS) Encoded 


» Probability of RS failure is low 


- No action taken at that time 

What’s With the Noise Spike? - Background 



• Linear devices such as analog comparators are 

- Used extensively in instruments, power, data collection, and 
more 

- Used to compare the voltage levels between two analog signals 

• What radiation-induced failure modes may exist? 

- Long-term degradation is focused on 

• Enhanced low dose rate sensitivity (ELDRS) and displacement 
damage (in bipolars) 

- Single events 

• Single event transients (SETs) are the prime concern. 



What’s With the Noise Spike? - 
Microwave Anisotropy Probe (MAP) 



• Launched June 30, 2001. 

- Had phasing orbits prior 
to insertion in final orbit. 

• Reached its final orbital 
position at L2 at the end of 
September 2001. 

• An anomaly occurred 
causing a reset of the 
spacecraft processor on 
November 5, 2001. 
What’s With the Noise Spike? - 
Resolving the Anomaly 



What steps were 
needed to 
determine ROOT 
CAUSE and action? 

- Review of 
environment during 
anomaly 

• Solar event 

- Significant heavy 
ion component 

- Review of parts list 

• Analog comparator 
(PM/LM139) identified 
as likely problem 


[Figure: GOES-8 proton flux (5-minute data), Nov 4 to Nov 7, 2001, Universal Time. Data from NOAA/SEC, Boulder, CO, USA.] 

[Figure: plot versus LET (MeV-cm²/mg), after Dyer, 2002] 
What’s With the Noise Spike? - 
Resolving the Anomaly (2) 




- Review of circuit application 

• Confirmed that LM/PM139 could be the cause 

• Application had changed since initial parts review pre-launch 

- Review or gather radiation test data 

• No documented proton sensitivity 

• Heavy ion sensitivity documented as a function of the 
application using existing data plus new data gathered 
What’s With the Noise Spike? - 

Final Analysis 

What steps were needed to determine ROOT CAUSE and action? 
- continued 

- Determine risk probability (i.e., upset rates with heavy ions) 

• Additional shielding analysis performed for particle transport 

• Assumption of sensitive volume thicknesses 

- Determine actions to mitigate or reduce risk 

• Event rates deemed acceptable by project 

• No action taken 
The Meaning of an Upset in a Fiber Optic 

Link (FOL) - Background 



FOLs 


- MIL-STD-1773 implementation (1 MHz) 
used since the early 1990’s in many 
NASA systems 

- Transmits electrical data and 
command signals to/from optical 

What radiation-induced failure modes 
may exist? 

- Similar to optocouplers 

- SEUs imply single or multi-bit errors 

• Photodiodes have a history of being 
used as energetic particle detectors. 

• Errors are temporal via photodiode 

- Transients may affect more than one 
clock cycle 

• High-speed electrical circuits also 
sensitive 


[Figure: Representative FOL architecture. Note: AS 1773 is a dual redundant bus. Bus B is not shown in this diagram.] 


• Major impact is on data bit error rate 
(BER) 


The Meaning of an Upset in a Fiber Optic 
Link (FOL)- Background (cont’d) 



Original MIL-STD-1773 transceivers used Si photodiodes 

- Sensitive to direct ionization from protons 

• Implies high bit error rate (BER) for space applications. 

- Angle of incidence, optical power budget, and proton energy effects 
noted 

This forced the use of the protocol’s fault-tolerant features 
(message retries). 

- Used successfully in NASA missions 

• BUT reduced effective bus bandwidth by ~50%. 

• For higher data rate systems, this hardening solution may not be applicable. 
[Figure: Ground data illustrating the effect of optical power budget on radiation performance. Y-axis: error cross section (cm²). Legend: errors observed / no errors observed.] 
The Meaning of an Upset in a Fiber Optic 
Link (FOL)- Making a Better Mousetrap 



Hardening methodologies explored 

- Change of optical wavelength from 850 nm to 1300 nm light showed 
improved SEU tolerance 

• Reduced volume of photodiode 

- Receiver noise filtering techniques and optical power budgets also 
help 

- Higher data rate development (20 MHz) - AS1773 

• Flown as an experiment on Microelectronics and Photonics Testbed (MPTB) 

- Boeing DR1773 Transceivers 



[Figure: MPTB DR1773 Test Board, with attenuators labeled] 
The Meaning of an Upset in a Fiber Optic Link (FOL) 

- MPTB Performance 


• MPTB launched in 1997 

- 6 years of in-flight 
performance in a highly 
elliptical orbit (HEO) 

• Transceivers were 
operated in two modes 

- ED mode used a 
physical contact (PC) 
polished fiber optic 
terminal 

- DE mode used a flat 
polished connector (air 
gap) 

• Which do you think 
would work better? 
The Meaning of an Upset in a Fiber Optic 

Link (FOL) - In-Flight 



• Did the hardening effort pay off? 


ED and DE bit error rates by year 

Year    ED BER       DE BER 
1997    1.738E-12    N/A 
1998    4.224E-14    3.787E-11 
1999    3.855E-14    5.303E-11 
2000    0            8.501E-11 
2001    8.168E-15    N/A 
2002    0            N/A 
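
To put the table in perspective, the sketch below computes the DE-to-ED BER ratio for each year in which both modes have a nonzero value. It is only an illustration: the values are transcribed from the table above (reading the garbled exponents as negative powers of ten), and the function name is hypothetical.

```python
from typing import Optional

# (ED BER, DE BER) per year, transcribed from the table; None means N/A.
ber_by_year: dict[int, tuple[Optional[float], Optional[float]]] = {
    1997: (1.738e-12, None),
    1998: (4.224e-14, 3.787e-11),
    1999: (3.855e-14, 5.303e-11),
    2000: (0.0, 8.501e-11),
    2001: (8.168e-15, None),
    2002: (0.0, None),
}


def de_to_ed_ratio(ed: Optional[float], de: Optional[float]) -> Optional[float]:
    """Return DE/ED, or None when either value is missing or ED is zero."""
    if ed is None or de is None or ed == 0:
        return None
    return de / ed


ratios = {year: de_to_ed_ratio(ed, de) for year, (ed, de) in ber_by_year.items()}

for year, ratio in ratios.items():
    label = "n/a" if ratio is None else f"{ratio:,.0f}x"
    print(f"{year}: DE/ED = {label}")
```

In the two years where the ratio is defined (1998 and 1999), the DE mode BER is on the order of a thousand times the ED mode BER.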
MPTB Transceiver DE Mode 1998-2000 Orbits 0036-3111 


Few errors were noted on the “good” PC 
Considerations 



• Methodical process for anomaly review takes into account 

- Environment 

- Selected parts 

- Design 

- Existing radiation test data and/or new data 

- Impact (i.e., risk probability) 

- Actions (mitigative or otherwise) 

• Notes: 

- Design and parts list reviews are good for flight programs 

• BUT, any changes later in design process need to be reviewed as 
well 

- Protons aren’t always the cause of anomalies during solar 
events 

• Solar heavy ions must be taken into account 

- System design and not just device radiation tolerance needs 
to be taken into account 

• Mechanical issues, for example, can be related (as in the FOL 
example) 

- Spacecraft charging effects not discussed, but should be 
considered as well 

• Can charging in plastic packages be the next SEU? 